Nature of Protein Family Signatures: Insights from Singular Value Analysis of Position-Specific Scoring Matrices

Authors

  • Akira R. Kinjo
  • Haruki Nakamura
Abstract

Position-specific scoring matrices (PSSMs) are useful for detecting weak homology in protein sequence analysis, and they are thought to contain essential signatures of protein families. To elucidate what constitutes such family-specific signatures, we apply singular value decomposition to a set of PSSMs and examine the properties of the dominant right and left singular vectors. The first right singular vectors were correlated with various amino acid indices, including relative mutability, amino acid composition in the protein interior, hydropathy, or turn propensity, depending on the protein. A significant correlation between the first left singular vector and a measure of site conservation was observed. It is shown that the contribution of the first singular component to the PSSMs acts to disfavor residues at conserved sites that are potentially, but falsely, identified as functionally important. The second right singular vectors were highly correlated with hydrophobicity scales, and the corresponding left singular vectors with contact numbers of protein structures. It is suggested that sequence alignment with a PSSM is essentially equivalent to threading supplemented with functional information. In addition, singular vectors may be useful for analyzing and annotating the characteristics of conserved sites in protein families.
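
The decomposition described in the abstract can be sketched as follows. This is a minimal illustration rather than the authors' procedure: it assumes a single PSSM stored as a sites-by-20 array (so left singular vectors index alignment sites and right singular vectors index amino acids), it uses the Kyte-Doolittle hydropathy scale as one example of an amino acid index, and it runs on a synthetic matrix produced by the hypothetical helper `toy_pssm`. All function names and the column order `AMINO_ACIDS` are assumptions made for this example. Because the toy matrix is built from a single hydropathy-like component, that component appears as the first singular vector here; the abstract reports hydrophobicity in the second component for real PSSMs.

```python
import numpy as np

# Assumed column order of the PSSM (one-letter amino acid codes).
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"

# Kyte-Doolittle hydropathy values, used here as an example amino acid index.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
HYDROPATHY = np.array([KYTE_DOOLITTLE[a] for a in AMINO_ACIDS])


def decompose_pssm(pssm):
    """Return (u, s, vt) with pssm == (u * s) @ vt.

    Columns of u are left singular vectors (one entry per site);
    rows of vt are right singular vectors (one entry per amino acid).
    """
    pssm = np.asarray(pssm, dtype=float)
    return np.linalg.svd(pssm, full_matrices=False)


def rank_k_component(u, s, vt, k):
    """Contribution of the k-th singular component (0-based) to the PSSM."""
    return s[k] * np.outer(u[:, k], vt[k])


def pearson(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])


def toy_pssm(n_sites=50, seed=0):
    """Synthetic stand-in for a PSSM (hypothetical, for illustration only).

    Each site gets a random 'burial' weight; scores are that weight times
    hydropathy, plus Gaussian noise.
    """
    rng = np.random.default_rng(seed)
    burial = rng.uniform(-1.0, 1.0, size=n_sites)
    noise = rng.normal(scale=0.5, size=(n_sites, len(AMINO_ACIDS)))
    return np.outer(burial, HYDROPATHY) + noise, burial


if __name__ == "__main__":
    pssm, burial = toy_pssm()
    u, s, vt = decompose_pssm(pssm)
    # Singular vectors are defined only up to sign, so compare |r|.
    print("|r| right vector vs hydropathy:", abs(pearson(vt[0], HYDROPATHY)))
    print("|r| left vector vs site weight:", abs(pearson(u[:, 0], burial)))
```

With a real PSSM in place of `toy_pssm`, the same correlations could be computed against other amino acid indices or per-site measures such as conservation or contact number; the absolute value is taken because the sign of a singular vector pair is arbitrary.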


Similar resources

Singular value inequalities for positive semidefinite matrices

In this note, we obtain some singular value inequalities for positive semidefinite matrices by using a block matrix technique. Our results are similar to some inequalities shown by Bhatia and Kittaneh in [Linear Algebra Appl. 308 (2000) 203-211] and [Linear Algebra Appl. 428 (2008) 2177-2191].


MulPSSM: a database of multiple position-specific scoring matrices of protein domain families

Representation of multiple sequence alignments of protein families in terms of position-specific scoring matrices (PSSMs) is commonly used in the detection of remote homologues. A PSSM is generated with respect to one of the sequences involved in the multiple sequence alignment as a reference. We have shown recently that the use of multiple PSSMs corresponding to an alignment, with several sequ...


Singular values of convex functions of matrices

Let $A_{i},B_{i},X_{i},i=1,\dots,m,$ be $n$-by-$n$ matrices such that $\sum_{i=1}^{m}\left\vert A_{i}\right\vert ^{2}$ and $\sum_{i=1}^{m}\left\vert B_{i}\right\vert ^{2}$ are nonzero matrices and each $X_{i}$ is positive semidefinite. It is shown that if $f$ is a nonnegative increasing convex function on $\left[ 0,\infty \right)$ satisfying $f\left( 0\right) =0$, then $$2s_{j}\left( f\left( \fra...


New Solutions for Singular Lane-Emden Equations Arising in Astrophysics Based on Shifted Ultraspherical Operational Matrices of Derivatives

In this paper, the ultraspherical operational matrices of derivatives are constructed. Based on these operational matrices, two numerical algorithms are presented and analyzed for obtaining new approximate spectral solutions of a class of linear and nonlinear Lane-Emden type singular initial value problems. The basic idea behind the suggested algorithms is built on transforming the eq...


eBLOCKs: enumerating conserved protein blocks to achieve maximal sensitivity and specificity

Classifying proteins into families and superfamilies allows identification of functionally important conserved domains. The motifs and scoring matrices derived from such conserved regions provide computational tools that recognize similar patterns in novel sequences, and thus enable the prediction of protein function for genomes. The eBLOCKs database enumerates a cascade of protein blocks with ...



Journal:
  • PLoS ONE

Volume 3

Publication date 2008